Description of data and programs
Replication for:
How important are banks for development? National banks in the United States 1870-1900
Review of Economics and Statistics
Scott L. Fulford
scott.fulford@bc.edu
27 June 2015

(1) Programs used:
Original National Bank Accounts in Microsoft Excel 97
Converted to Stata version 13.1
All analysis in Stata version 13.1, running on Microsoft Windows 8.1
ArcGIS (used to locate cities within counties; see section 3d)

(2) For those interested in analysis of national banks, two data sets will be of particular use (their creation is described in section 3):

* National_Banks_counties1890.dta places national banks in 1890 counties and combines them with census information. Everything is aggregated at the county level.

* National_Bank_city.dta instead has information on individual national banks and their locations.

(3) Creation of National Banks data file

(3a) National_banks1.do combines the accounts stored in National_Bank_Accounts.xls, which were compiled from the reports of the Comptroller of the Currency.
Output: National_Bank_Accounts.dta

(3b) National_banks2.do matches the name of each national bank's location against the Geographic Names Information System (GNIS) compiled by the USGS, available at
http://geonames.usgs.gov/domestic/download_data.htm
Output: National_Bank_city.dta 
		National_Bank_city1870.csv
		National_Bank_city1880.csv
		National_Bank_city1890.csv
		National_Bank_city1900.csv
		National_Bank_city1902.csv

(3c) National_banks3.do loads county-level census information from the National Historical GIS (NHGIS), http://www.nhgis.org/
Output: census_`year'_county.dta
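
The `year' in the output name above is a Stata local macro, so the file name expands once per value of that macro. A minimal sketch of how such a name expands inside a loop follows. The list of census years used here is an assumption for illustration, not taken from National_banks3.do.

```stata
* Hypothetical illustration: the year list below is an assumption,
* not copied from National_banks3.do.
foreach year in 1870 1880 1890 1900 {
    * `year' is replaced by the current loop value
    local outfile "census_`year'_county.dta"
    display "`outfile'"
}
```

Each pass through the loop displays one file name, such as census_1870_county.dta.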

(3d) Use ArcGIS to locate each city within the 1890 county shapefile obtained from NHGIS. National_banks4.do then combines the bank files, with each city assigned to a county, with the census county files created by National_Banks3.do.
Output: National_Banks_counties1890.dta

(3e) National_Banks5.do performs some additional cleanup and calculates distances.
Output: National_Banks_counties1890.dta
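
This description does not say how National_Banks5.do computes distances. As a rough sketch only, great-circle (haversine) distance from a reference point could be computed in Stata as below. The variable names, the coordinates, the choice of St. Louis as the reference point, and the haversine formula itself are all assumptions for illustration.

```stata
* Hypothetical example data: names and approximate coordinates are
* illustrative and are not drawn from the replication files.
clear
input str12 place double(lat lon)
"St. Louis" 38.627 -90.199
"Chicago"   41.878 -87.630
end

* Reference point (assumed): St. Louis, in decimal degrees
scalar lat0 = 38.627
scalar lon0 = -90.199
scalar d2r  = _pi/180          // degrees to radians

* Haversine formula with an Earth radius of 6371 km
gen double dlat = (lat - lat0)*d2r
gen double dlon = (lon - lon0)*d2r
gen double a = sin(dlat/2)^2 + cos(lat0*d2r)*cos(lat*d2r)*sin(dlon/2)^2
gen double dist_km = 2*6371*asin(min(1, sqrt(a)))

list place dist_km
```

The min(1, ...) guard keeps rounding error from pushing the argument of asin() above 1.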

(4) Analysis of National Banks

(4a) National_bank_analysis1.do transforms variables into per capita or log terms, and produces summary statistics for Table 1. 
Output: National_Banks_counties1890_addvar.dta
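
A minimal sketch of the kind of per capita and log transformation described in (4a) follows. The variable names bank_capital, tpop, capital_pc, and ln_capital_pc are hypothetical, and the numbers are made up. Only ln_tpop is a name that appears in this description.

```stata
* Hypothetical example data: variable names and values are illustrative only.
clear
input double(bank_capital tpop)
100000 20000
0       5000
250000 50000
end

* Per capita term
gen double capital_pc = bank_capital/tpop

* Log terms; the log is left missing where capital per capita is zero
gen double ln_tpop = ln(tpop)
gen double ln_capital_pc = ln(capital_pc) if capital_pc > 0

summarize capital_pc ln_tpop ln_capital_pc
```

Counties with no bank capital get a missing log value, so summarize reports fewer observations for ln_capital_pc than for the other two variables.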

(4b) National_banks_analysis_plots1.do produces the reduced-form analysis in Table 3, the histogram "Figure 1: Distribution of national bank capital stock in U.S. counties," and the scatter plot "Figure 2: National bank capital stock in U.S. counties."

(4c) National_banks_analysis2.do performs the log-likelihood estimation shown in "Table 2: Log-likelihood estimates of optimal capital" and then, based on these parameters, creates the multiple imputations of the unobserved capital stock. The program runs on different variations of the underlying sample ("allrural" or "unionrural"), and performs the estimation both including and excluding the 25 capital banks in 1902. Finally, the program runs a Gibbs sampler to draw from the distribution implied by the estimates and creates:
MI_National_banks_allrural_v0
MI_National_banks25_allrural_v0
MI_National_banks_unionrural_v0
MI_National_banks25_unionrural_v0
The v3 multiple imputation (MI) files, which exclude scale effects for the robustness analysis, can be created by removing ln_tpop from the MLE.
 
(4d) National_banks_analysis3_mean.do produces the main analysis. It loops over different window sizes (15 and 10), samples (allrural, unionrural, including 25, excluding 25), and outcomes to produce the tables "Table 4: Banks and total production per capita," "Table 5: Banks and the mix of production," and "Table 6: Robustness to sample changes and distributional assumptions," and the supplemental appendix tables "Table S.2: Additional robustness variations" and "Table S.3: Distance to banking." Some robustness options are commented out and must be uncommented to reproduce the corresponding results. Note that the full analysis takes a long time to run.

(4e) National_banks_analysis4_v2.do performs additional analysis and robustness checks on the national banks multiple imputation data. It creates "Figure 3: Draws from optimal capital and county characteristics," as well as the supplemental appendix figures "Figure S.2: Optimal capital stock and distance from St. Louis" and "Figure S.3: The conditional distribution of optimal capital stock in four counties."